pyiwn: A Python-based API to access Indian Language WordNets
Abstract
Indian language WordNets have their own web-based browsing interfaces, along with a common interface for IndoWordNet. These interfaces are useful for language learners and in educational settings; however, they do not let users connect to the WordNets and browse their data through a lucid application programming interface (API). In this paper, we present our work on creating such an easy-to-use framework, which is bundled with the data for Indian language WordNets and provides core functionality in Python similar to the NLTK WordNet interface. Additionally, we use a pre-built speech synthesis system for Hindi and augment the Hindi data with audio for words, glosses, and example sentences. We describe the usage of our API in detail and explain its functions. We also package the IndoWordNet data with the source code and release it openly for research purposes. We aim to provide all our work as an open-source framework for further development.
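To make the idea of an "NLTK WordNet interface like" lookup concrete, the following is a minimal, self-contained sketch of that access pattern. It is not the pyiwn API: the names `ToySynset`, `ToyWordNet`, and `synsets`, as well as the two entries and their glosses, are hypothetical and invented purely for illustration; they are not taken from IndoWordNet data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToySynset:
    """Hypothetical synset record: an ID, a part of speech, lemmas, and a gloss."""
    synset_id: int
    pos: str
    lemmas: tuple
    gloss: str


class ToyWordNet:
    """Hypothetical in-memory store indexed by lemma (not the pyiwn implementation)."""

    def __init__(self, synsets):
        # Build a lemma -> list-of-synsets index once, so lookups are cheap.
        self._by_lemma = {}
        for synset in synsets:
            for lemma in synset.lemmas:
                self._by_lemma.setdefault(lemma, []).append(synset)

    def synsets(self, word, pos=None):
        """Return all synsets containing `word`, optionally filtered by POS."""
        found = self._by_lemma.get(word, [])
        return [s for s in found if pos is None or s.pos == pos]


# Invented toy entries; real data would come from the packaged WordNet files.
toy = ToyWordNet([
    ToySynset(1, "noun", ("पानी", "जल"), "toy gloss: water"),
    ToySynset(2, "noun", ("घर",), "toy gloss: house"),
])

for synset in toy.synsets("पानी"):
    print(synset.synset_id, synset.lemmas, synset.gloss)
```

The point of the sketch is only the shape of the interface: a word goes in and a list of synset objects comes out, with synonyms sharing a synset. This is the style of lookup that NLTK's WordNet reader exposes through `synsets()`.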
Related resources
Sophisticated Lexical Databases - Simplified Usage: Mobile Applications and Browser Plugins For Wordnets
India is a country with 22 officially recognized languages, and 17 of these have WordNets, a crucial resource. Web-browser-based interfaces are available for these WordNets, but they are not suited to mobile devices, which deters people from using this resource effectively. We present our initial work on developing mobile applications and browser extensions to access WordNets for Indian languages. Ou...
A Practical Python API for Querying AFLOWLIB
Conrad W. Rosenbrock, Department of Physics and Astronomy, Brigham Young University, Provo, Utah 84602, USA (Dated: October 3, 2017). Abstract: Large databases such as aflowlib.org provide valuable data sources for discovering material trends through machine learning. Although a REST API and query language are available, there is a learning curve associated with the AFLUX language that acts as a ...
Brahmi-Net: A transliteration and script conversion system for languages of the Indian subcontinent
We present Brahmi-Net, an online system for transliteration and script conversion for all major Indian language pairs (306 pairs). The system covers 13 Indo-Aryan languages, 4 Dravidian languages, and English. To train the transliteration systems, we mined parallel transliteration corpora from parallel translation corpora using an unsupervised method and trained statistical transliteration sy...
The NLTK FrameNet API: Designing for Discoverability with a Rich Linguistic Resource
A new Python API, integrated within the NLTK suite, offers access to the FrameNet 1.7 lexical database. The lexicon (structured in terms of frames) as well as annotated sentences can be processed programmatically, or browsed with human-readable displays via the interactive Python prompt.
IndoWordNet and its Linking with Ontology
Reasoning about natural language requires combining semantically rich lexical resources with world knowledge, provided by ontologies. In this paper, we describe the linking of WordNets of Indian languages with an upper ontology, SUMO (Suggested Upper Merged Ontology). This creates a multilingual resource for Indian languages which can be used in various natural language processing applications. This p...